Digital Forensics Formats: Seeking a Digital Preservation Storage Container Format for Web Archiving
Abstract
In this paper we discuss archival storage container formats from the point of view of digital curation and preservation, an aspect of preservation overlooked by most other studies. Taking established approaches to data management as our starting point, we selected seven container format attributes that are core to the long-term accessibility of digital materials. We have labeled these core preservation attributes. These attributes are then used as evaluation criteria to compare storage container formats belonging to five common categories: formats for archiving selected content (e.g. tar, WARC), disk image formats that capture data for recovery or installation (partimage, dd raw image), these two types combined with a selected compression algorithm (e.g. tar+gzip), formats that combine packing and compression (e.g. 7-zip), and forensic file formats for data analysis in criminal investigations (e.g. aff, the Advanced Forensic File format). We present a general discussion of the storage container format landscape in terms of these attributes, and make a direct comparison between the three most promising archival formats: tar, WARC, and aff. We conclude by suggesting the next steps to take the research forward and to validate the observations we have made.
International Journal of Digital Curation (2012), 7(2), 21–39. http://dx.doi.org/10.2218/ijdc.v7i2.227
The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by UKOLN at the University of Bath and is a publication of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/
Similar resources
From the World Wide Web to Digital Library Stacks: Preserving the French Web Archives
The National Library of France is mandated by French law to collect and preserve the French Internet. It is now a 10-year old project with collections ranging from 1996 to the present. To ensure their long-term preservation, the choice has been made to ingest these web archives into the institution’s existing digital preservation repository, SPAR (Scalable Preservation and Archiving Repository)...
Evaluating File Formats for Long-term Preservation
National and international publishers have been depositing digital publications at the National Library of the Netherlands (KB) since 2003. Until recently, most of these publications were deposited in the Portable Document Format. New projects, for example the web archiving project, force the KB to handle more heterogeneous material. Therefore, the KB has developed a quantifiable file format ri...
Management of Storage Devices and File Formats in Web Archive Systems
Many national libraries are making efforts to crawl and store various born-digital information, but there are many difficult social, legal and technical problems. In this paper, from the viewpoint of long-term preservation of digital content, we focus on the urgent task of the storage system, since the size of the web archive is increasing exponentially. In order to archive monoto...
Using the Web Infrastructure for Digital Preservation
To date, most of the focus regarding digital preservation has been on removing copies of the resources to be preserved from the “living web” and placing them in an archive for controlled curation. Once inside an archive, the resources are subject to careful processes of refreshing (making additional copies to new media) and migrating (conversion to new formats and applications). For small numbe...
Challenges of Long-Term Digital Archiving: A Survey
With an ever-increasing volume of digital records and compliance requirements mandated by regulations, electronic record archiving grows to be more and more important in the digital era. The fundamental functionality of digital archiving includes keeping data content intact and providing provable evidence of events ever happened to the data. The main challenges of long-term digital archiving in...
Journal title:
- IJDC
Volume 7, issue
Pages -
Publication date 2012